1. INTRODUCTION

The first car to combine computer vision with a smart steering system, named Navlab, emerged in the
1980s at Carnegie Mellon University [1]. Since then, several attempts have been made to make fully
autonomous vehicles safer, more efficient, and more environmentally responsible. One of the most
advanced smart embedded systems to date is described in [2], where a self-driving car has driven more
than sixteen million kilometers autonomously.

The main tasks in developing an autonomous vehicle system are environment perception, mapping and
localization, motion planning, decision making, and control. Using images captured by one or more
cameras, lidar, and other sensors, the perception task detects and interprets the local environment in
which the vehicle is driving. Studies addressing perception are found in [3], [5]. These tasks are
carried out by modules of the embedded smart systems installed in an autonomous vehicle, and are
developed using "model-based", "machine learning-based", or "hybrid-based" control methods.

Motion planning methods generate an optimal collision-free path from the current vehicle state to the
desired state. They can be divided into global planners and local planners, the latter considering only
local information. These methods compute candidate trajectories based on the kinodynamic model and then
select the best one with respect to safety and other relevant criteria. Such methods were used by some
DARPA Urban Challenge team cars [6]. The dynamic window approach (DWA) proposed by [7], a classic
reactive motion planning method, selects a pair of velocities by optimizing an objective function over a
short time horizon, discarding any velocities that generate a path with a collision within a minimum
distance.

In recent years, machine learning has gained ground in robotics, with many works on path planning
and control. In [8], two neural networks are designed to find the free navigation space and the
trajectory for a mobile robot. Reinforcement learning combined with the DWA path planning method is
proposed by [9], where the Q-learning algorithm adjusts the weights of the DWA objective function for
each evaluation task. Related works that combine machine learning with robot path planning are presented
in [10], [12].

Model-based navigation control methods are designed to control the vehicle based on the selected
motion paths. The Stanley method, first introduced in the DARPA Grand Challenge [13], model predictive
control (MPC), fuzzy control, and preview control are some of the techniques still under active research.
Machine learning techniques have also been applied to robot control, including end-to-end navigation
control systems based fully on machine learning.

Geometric-based autonomous navigation systems are designed under certain assumptions and can become
computationally expensive. Navigation systems based on machine learning can improve performance but
may produce wrong outputs in unseen situations, leading the vehicle to undesired motions. Hybrid
methodologies that combine geometric-based navigation models with machine learning can yield good
results by taking advantage of both approaches.

This paper presents a hybrid controller methodology for lane centering with obstacle avoidance. The
controller is inspired by the image-based dynamic window approach (IDWA) [16], in which autonomous
navigation is driven by visual features and a modified DWA function evaluates the reactive control.
A convolutional neural network is trained to segment road lane lines from camera images. Visual
features are extracted from the segmented images, and another neural network model is trained to predict
the pair of velocities to be applied to the vehicle. When these velocities would lead the vehicle to a
collision, reactive control is performed to find and select the best collision-free path according to a
modified DWA optimization function, called the image-based reduced dynamic window approach (IRDWA). A
third machine learning model implements the reduced dynamic window functionality, which aims to reduce
the search space of the optimum velocities. Each aspect of the proposed system is presented in the
following sections.

This paper is structured as follows. Section 2 presents the general aspects of the proposed system,
with a brief explanation of the high- and low-level controllers. Section 3 explains in more detail the
high-level control block, which is the focus of this work. The proposed system is evaluated in
simulation, where a car equipped with a camera and a single-layer lidar sensor must accomplish
lane-keeping and obstacle avoidance tasks. The results are presented in Section 4, and Section 5
concludes this work.


2. MODELING ASPECTS: GENERAL OVERVIEW OF THE SYSTEM 

Two main coordinate frames represent the camera and LIDAR positions and are shown in Figure 1.
The former is located on the front of the car's roof (camera frame), while the latter is placed at the
front of the car (world frame). The relative displacement of the camera frame with respect to the world
frame is tz (along the Zw axis) and ty (along the Yw axis), and its relative rotation about the Xw axis
is φ. The camera points along the Zc axis in order to capture images containing the road lanes.

The overall system can be divided into three main blocks, as shown in Figure 2. The first block
corresponds to all sensors and actuators installed in the vehicle, which provide current state
information such as the linear velocity (V_t), the yaw rate (W_t), the camera image, and the 2D point
cloud from the single-layer lidar sensor. Then, based on the kinematic bicycle model [17], the high-level
controller processes all this information to find the next pair of velocities to be applied to the
vehicle. The desired longitudinal velocity is pre-configured, so the system tends to keep it; when
reactive control is triggered, this velocity is adjusted according to the current scenario (IRDWA
optimization step, explained in Section 3.4). The last block, the low-level controller, controls the
vehicle steering, throttle, and brake so that the vehicle achieves the desired velocities (V_{t+1} and
W_{t+1}). Each block is explained in the next sections.


Figure 2. System overview
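
The relation between the velocity pair (V, W) and the steering command under a kinematic bicycle model can be sketched as follows. This is only an illustration under stated assumptions: a rear-axle reference point, a hypothetical wheelbase value, and hypothetical function names; the paper does not give its low-level control law.

```python
import math


def bicycle_step(x, y, yaw, v, steer, wheelbase, dt):
    """Advance a rear-axle kinematic bicycle model by one time step.

    Returns the new pose and the yaw rate produced by the steering angle.
    """
    yaw_rate = v * math.tan(steer) / wheelbase
    x_next = x + v * math.cos(yaw) * dt
    y_next = y + v * math.sin(yaw) * dt
    yaw_next = yaw + yaw_rate * dt
    return x_next, y_next, yaw_next, yaw_rate


def steer_for_yaw_rate(v, yaw_rate, wheelbase):
    """Steering angle that yields the requested yaw rate at speed v."""
    if abs(v) < 1e-6:
        return 0.0  # yaw rate cannot be produced at standstill in this model
    return math.atan(wheelbase * yaw_rate / v)


# Example: which steering angle gives W = 0.2 rad/s at V = 4 m/s?
# The wheelbase of 2.7 m is an assumed value, not taken from the paper.
steer = steer_for_yaw_rate(4.0, 0.2, 2.7)
pose = bicycle_step(0.0, 0.0, 0.0, 4.0, steer, 2.7, 0.1)
```

In this sketch the inversion is exact because the model is purely kinematic; a real low-level controller would also have to handle actuator limits and longitudinal dynamics.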


3. HIGH CONTROL: ENVIRONMENT PERCEPTION AND VELOCITIES ESTIMATION

All the steps that make up the high-level controller are summarized in Figure 3. The diagram shows
the sequence of the whole process, which has four steps. It begins with lane line detection and
tracking, followed by control parameter estimation. The next step is yaw rate finding. Finally, the
optimal velocities are found with the IRDWA method.


Figure 3. High level control diagram


3.1. Step 1: Lane lines detection and tracking

The proposed steps for lane line detection and tracking are shown in Figure 4. This task is achieved
in two steps: lane line detection and lane line tracking.


Figure 4. Lane lines detection and tracking diagram


3.1.1. Lane lines’ detection 

The first step is lane line detection in the images provided by the camera, which is divided into two
parts. The first part creates binary masks from the raw image, one per line, and the second extracts a
model for each line. The binary mask prediction process is illustrated in the block diagram in Figure 5.


Figure 5. Binary masks predictions process


The binary masks are predicted by two deep-learning models, which are trained and tested on the
CULane dataset [18]. Before prediction, the image is preprocessed: pixel values are normalized, and only
the region of interest, the lower part of the image, is kept.
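
A minimal sketch of this preprocessing is given below. It assumes an 8-bit RGB image stored as nested lists, a region of interest of 64 rows (chosen to match the input shape reported in Figure 6), and scaling to the range [0, 1]; the actual ROI limits and normalization constants are not stated in the text, and the function name is hypothetical.

```python
def preprocess(image, roi_rows=64):
    """Keep the lowest `roi_rows` rows of an image and scale pixels to [0, 1].

    `image` is a list of rows, each row a list of [r, g, b] values in 0..255.
    """
    if len(image) < roi_rows:
        raise ValueError("image has fewer rows than the region of interest")
    roi = image[-roi_rows:]  # lower part of the image
    return [[[channel / 255.0 for channel in pixel] for pixel in row]
            for row in roi]


# Example: a 100 x 320 dummy frame reduced to the 64 x 320 x 3 network input.
frame = [[[255, 0, 128] for _ in range(320)] for _ in range(100)]
net_input = preprocess(frame)
```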

The first model, an autoencoder, predicts the binary masks from the current image. The autoencoder
is composed of two parts, an encoder and a decoder. The encoder compresses and selects information
inside the image with successive convolution/pooling layers. Once features are extracted from the raw
image, the decoder maps this information back to the initial spatial dimensions and creates 4 binary
masks, each one representing a lane line. Figure 6 shows the shapes of the input image and of the output
masks.


[Figure 6 content: input image (64, 320, 3); output masks (4, 64, 320)]


Figure 6. Input and output shapes 


The second model is trained to predict the future binary masks from the previous binary masks. This
part takes previous predictions into account and improves the robustness of the autoencoder model. The
model is a convolutional LSTM network [19]. The final masks are the weighted average of the two
predictions:

$$
\mathrm{masks}_{final} = \frac{W_{auto} \cdot \mathrm{masks}_{auto} + W_{track} \cdot \mathrm{masks}_{track}}{W_{auto} + W_{track}} \tag{1}
$$

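
The weighted combination in (1) can be illustrated for a single mask as follows. The final thresholding at 0.5 and the default weights are assumptions made for this sketch; the text does not state how the averaged mask is binarized or which weights are used.

```python
def fuse_masks(mask_auto, mask_track, w_auto=1.0, w_track=1.0, threshold=0.5):
    """Weighted average of two mask predictions, followed by binarization.

    Both masks are lists of rows with values in [0, 1] and the same shape.
    """
    fused = []
    for row_auto, row_track in zip(mask_auto, mask_track):
        fused_row = []
        for a, t in zip(row_auto, row_track):
            value = (w_auto * a + w_track * t) / (w_auto + w_track)
            fused_row.append(1 if value >= threshold else 0)
        fused.append(fused_row)
    return fused


# With equal weights, a pixel seen only by the autoencoder (0.8 vs 0.0) is dropped;
# giving the autoencoder three times the weight keeps it.
equal = fuse_masks([[1.0, 0.0, 0.8]], [[1.0, 0.0, 0.0]])
biased = fuse_masks([[1.0, 0.0, 0.8]], [[1.0, 0.0, 0.0]], w_auto=3.0, w_track=1.0)
```
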
Figure 7 shows an example of prediction, where each colored line represents the activated pixels of
one mask. As explained above, the tracking process improves the robustness of the prediction. It can
help to fill in partial predictions (Figure 8) or remove outliers.


Figure 7. Prediction lines example 


Figure 8. Tracking robustness example, top image shows predictions without tracking and bottom image 
shows predictions with tracking. 


3.1.2. Lane lines tracking 


In situations such as when the vehicle is between two lanes while changing lanes (for example, to
overtake an obstacle), the segmented image tends to present more than one candidate for a lane line, and
the correct one must be selected. Thus, on each segmented image, all activated pixels are clustered using
the agglomerative clustering algorithm [15], where each cluster is treated as a line candidate and the
best one is chosen to represent the correct line points. To keep tracking the 4 lane lines along the way,
the best candidate is the one with the smallest error, defined by (2), with respect to the corresponding
line position and inclination calculated in the previous image frame.


$$
\mathrm{error}_{l} = \epsilon_{coef} + \alpha \cdot \epsilon_{int} \tag{2}
$$
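
The candidate selection can be sketched as below. This is an interpretation rather than the paper's implementation: each cluster is fitted with a straight line x = slope * y + intercept, and the two error terms in (2) are assumed to be the absolute differences in slope (inclination) and intercept (position) with respect to the previous frame. The weight value and all function names are hypothetical.

```python
def fit_line(points):
    """Least-squares fit of x = slope * y + intercept to (x, y) pixel points."""
    n = len(points)
    mean_x = sum(p[0] for p in points) / n
    mean_y = sum(p[1] for p in points) / n
    var_y = sum((p[1] - mean_y) ** 2 for p in points)
    if var_y == 0:
        raise ValueError("points must span more than one image row")
    cov = sum((p[0] - mean_x) * (p[1] - mean_y) for p in points)
    slope = cov / var_y
    return slope, mean_x - slope * mean_y


def select_line(clusters, prev_slope, prev_intercept, alpha=0.01):
    """Return (index, error) of the cluster closest to the previous line."""
    best_index, best_error = None, float("inf")
    for index, cluster in enumerate(clusters):
        slope, intercept = fit_line(cluster)
        error = abs(slope - prev_slope) + alpha * abs(intercept - prev_intercept)
        if error < best_error:
            best_index, best_error = index, error
    return best_index, best_error


# Two candidate clusters; the previous frame had slope 0.45 and intercept 105.
cluster_a = [(100 + 0.5 * y, y) for y in range(0, 60, 10)]
cluster_b = [(200 - 1.0 * y, y) for y in range(0, 60, 10)]
chosen, err = select_line([cluster_a, cluster_b], 0.45, 105.0)
```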


Lane lines that are not detected in the segmentation task are discarded, and the outermost detected
lines are treated as road obstacles that represent the road limits. An image frame to world frame
conversion, which assumes a planar road surface, is necessary to transform these lines into obstacles.


3.2. Step 2: Control parameters estimation 

Right after the lane lines are selected in the previous step, visual parameters are extracted from
the image, as shown in Figure 9. These are the visual parameters used in the visual-based controller
proposed by [16]. Point P is located on the lane center line (where the vehicle must keep driving) at a
predefined vertical distance Y from the image bottom. Parameter X is the horizontal distance from point P
to the image center. The last parameter, Θ, is the angle between the tangent to the lane center line and
the vertical axis. The X and Θ values are then sent to the yaw rate finding step, which is explained in
the next section.
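
A small sketch of this extraction is given below, assuming the lane center line is modeled as a straight line x = slope * row + intercept in pixel coordinates, with rows counted from the top of the image. The sign conventions for X and Θ and the function name are assumptions; for a curved line model, the slope would be replaced by the derivative of the curve at point P.

```python
import math


def visual_parameters(slope, intercept, width, height, y_offset):
    """Compute (X, Theta) from a center-line model x = slope * row + intercept.

    `y_offset` is the vertical distance Y of point P from the image bottom.
    """
    row = height - y_offset             # image row of point P
    p_x = slope * row + intercept       # horizontal position of point P
    x_param = p_x - width / 2.0         # horizontal distance to the image center
    theta = math.atan(slope)            # angle between tangent and vertical axis
    return x_param, theta


# Example on a 320 x 64 image with P placed 10 pixels above the bottom row.
x_param, theta = visual_parameters(1.0, 100.0, 320, 64, 10)
```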


Figure 9. Visual parameters 


3.3. Step 3: Yaw rate finding 

The yaw rate W_{t+1} that the vehicle must achieve in order to keep driving in the desired lane,
given the desired longitudinal velocity, is estimated using a neural network model. This model was
trained by supervised learning with data gathered in the CARLA simulator (on map 07), where a car drove
along a road while keeping to the lane center. The model takes as input the extracted visual parameters X
and Θ, the current linear velocity V_t, and the current yaw rate W_t.

The predicted yaw rate and the desired longitudinal velocity are then checked to determine whether
they would drive the car into a collision. This is done by calculating the distance to collision, as
proposed by [21], for all obstacles in the 2D point cloud provided by the lidar. If the distance is less
than a threshold value, reactive control (the IRDWA block) finds optimum velocities that avoid the
obstacles and keep the vehicle as close as possible to the lane center. Otherwise, if the distance to
collision is large enough, the predicted velocities are used.
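
The check can be illustrated with a simplified stand-in for the distance-to-collision computation. The sketch below does not reproduce the method of [21]: it simply integrates a constant (V, W) arc from the vehicle origin and returns the distance travelled before any obstacle point comes within an assumed safety radius. The radius, horizon, step size, and function names are all hypothetical.

```python
import math


def distance_to_collision(v, w, obstacles, radius=1.0, horizon=10.0, step=0.05):
    """Distance travelled along a constant (v, w) arc before hitting an obstacle.

    Obstacles are (x, y) points in the vehicle frame (x forward). Returns
    math.inf when no obstacle is reached within the time horizon.
    """
    x = y = yaw = 0.0
    travelled = 0.0
    for _ in range(int(horizon / step)):
        for ox, oy in obstacles:
            if math.hypot(ox - x, oy - y) <= radius:
                return travelled
        x += v * math.cos(yaw) * step
        y += v * math.sin(yaw) * step
        yaw += w * step
        travelled += abs(v) * step
    return math.inf


def needs_reactive_control(v, w, obstacles, d_min=10.0):
    """True when the predicted velocities lead to a collision closer than d_min."""
    return distance_to_collision(v, w, obstacles) < d_min


# An obstacle 8 m straight ahead triggers reactive control at 4 m/s.
trigger = needs_reactive_control(4.0, 0.0, [(8.0, 0.0)])
```

The default `d_min` of 10 m mirrors the minimum distance to collision used in the validation section.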


3.4. Step 4: Finding optimal velocities with IRDWA method 

When a collision is detected, a new pair of velocities (V_{t+1} and W_{t+1}) is found by maximizing
the proposed IRDWA objective function shown in (3). V_min and V_max are the minimum and maximum desired
velocities, W_max is the maximum desired yaw rate, D_coll is the distance to collision, and D_min is the
minimum allowed distance to collision. For each possible pair of velocities belonging to the search space
shown in Figure 10, the distance to collision with each obstacle point is calculated, and the minimum
value is taken as the D_coll value for these velocities. V_br and W_br are the maximum brake
accelerations, a_c is the maximum acceleration, and dt is the time between two high-level control
timestamps.


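$$
\mathrm{IRDWA} = gain_{v1} \cdot Velocity_1 + gain_{dist} \cdot Dist + gain_{v2} \cdot Velocity_2 \tag{3}
$$

where

$$
Dist = \frac{D_{coll}}{D_{min}}, \qquad
Velocity_1 = 1 - \frac{|W_p - W_{t+1}|}{W_{max}}
$$

$$
Velocity_2 =
\begin{cases}
\dfrac{V_{max} - V_d}{V_{max} - V_{t+1}}, & \text{if } V_d > V_{t+1} \\[2ex]
\dfrac{V_d - V_{min}}{V_{t+1} - V_{min}}, & \text{if } V_d \le V_{t+1}
\end{cases}
$$

with W_p the yaw rate predicted in Step 3 and V_d the desired longitudinal velocity, subject to

$$
\{(V_{t+1}, W_{t+1})\} \in V_{max} \cap V_a \cap V_s \cap W_{max} \cap W_s \tag{4}
$$

where

$$
V_t - V_{br} \cdot dt \le V_a \le V_t + a_c \cdot dt
$$

$$
V_s \le \sqrt{2 \cdot D_{coll} \cdot V_{br}}, \qquad
W_s \le \sqrt{2 \cdot D_{coll} \cdot W_{br}}
$$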
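
An exhaustive maximization of an objective of this form over the constrained window can be sketched as follows. Several points are assumptions rather than facts from the paper: the exact form of the two velocity terms (the printed equations are partly unreadable in the source), the cap `d_cap` applied to D_coll so that a collision-free candidate gets a finite score, the parameter values in the example, and every function and dictionary key name. The distance-to-collision function is passed in as `dist_fn`.

```python
import math


def velocity_terms(v, w, v_des, w_pred, p):
    """Assumed forms of Velocity_1 and Velocity_2 for a candidate pair (v, w)."""
    velocity_1 = 1.0 - abs(w_pred - w) / p["w_max"]
    if v_des > v:
        velocity_2 = (p["v_max"] - v_des) / (p["v_max"] - v)
    elif v > p["v_min"]:
        velocity_2 = (v_des - p["v_min"]) / (v - p["v_min"])
    else:
        velocity_2 = 1.0  # v == v_des == v_min, avoids a 0/0 division
    return velocity_1, velocity_2


def irdwa_score(v, w, d_coll, v_des, w_pred, p):
    """Weighted sum of the three terms; d_cap is an assumption of this sketch."""
    velocity_1, velocity_2 = velocity_terms(v, w, v_des, w_pred, p)
    dist = min(d_coll, p["d_cap"]) / p["d_min"]
    gain_v1, gain_dist, gain_v2 = p["gains"]
    return gain_v1 * velocity_1 + gain_dist * dist + gain_v2 * velocity_2


def grid(lo, hi, step):
    """Evenly spaced values from lo up to hi (inclusive when it fits)."""
    count = int(math.floor((hi - lo) / step + 1e-9))
    return [lo + i * step for i in range(count + 1)]


def irdwa_search(v_t, v_des, w_pred, dist_fn, p, v_step=0.15, w_step=0.05):
    """Exhaustive search; returns (score, v, w) or None if nothing is admissible."""
    v_lo = max(p["v_min"], v_t - p["v_br"] * p["dt"])   # braking limit
    v_hi = min(p["v_max"], v_t + p["a_c"] * p["dt"])    # acceleration limit
    best = None
    for v in grid(v_lo, v_hi, v_step):
        for w in grid(-p["w_max"], p["w_max"], w_step):
            d_coll = dist_fn(v, w)
            if v > math.sqrt(2.0 * d_coll * p["v_br"]):
                continue  # cannot stop before the obstacle
            if abs(w) > math.sqrt(2.0 * d_coll * p["w_br"]):
                continue  # cannot stop turning before the obstacle
            score = irdwa_score(v, w, d_coll, v_des, w_pred, p)
            if best is None or score > best[0]:
                best = (score, v, w)
    return best


params = {"v_min": 1.0, "v_max": 6.0, "w_max": 0.5, "a_c": 2.0, "v_br": 4.0,
          "w_br": 2.0, "dt": 0.5, "d_min": 10.0, "d_cap": 30.0,
          "gains": (1.0, 1.0, 1.0)}
# Free road: the search stays close to the desired velocity and predicted yaw rate.
free_road = irdwa_search(4.0, 4.0, 0.1, lambda v, w: math.inf, params)
```

The step sizes default to the values listed for the exhaustive method in the results tables (V step 0.15, W step 0.05).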


As the number of obstacles increases, more calculations must be done for each evaluated velocity.
For this reason, reducing the velocity search space can optimize this process, either by reducing the
total number of calculations (lower computational cost) or by allowing a finer search, provided that the
reduced search space still includes the optimal values. An example of this search space is shown in
Figure 10, where a heatmap shows the regions with higher IRDWA values.


Figure 10. Search space


Supervised machine learning is used to perform the proposed search window reduction. A classifier
based on XGBoost, a scalable tree boosting algorithm, is trained to predict the range of yaw rate values
that contains the optimum value. This model receives as input a flattened 2D occupancy grid containing
all obstacle points around the vehicle, the current velocities (V_t and W_t), and the values X and Θ
provided by the control parameters estimation block. The occupancy grid is built from the single-layer
LIDAR points, in 2D space, and covers a planar surface around the vehicle, with the sensor placed at the
front of the car.
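
The construction of the classifier input and of its class labels can be sketched as follows. The grid extent, the cell size, and the number of yaw rate classes are assumptions (the text does not give them), the function names are hypothetical, and the classifier itself is left out so that the example stays self-contained.

```python
def occupancy_grid(points, x_range=(0.0, 20.0), y_range=(-10.0, 10.0), cell=1.0):
    """Flattened binary occupancy grid built from 2D lidar points (row-major)."""
    n_cols = int((x_range[1] - x_range[0]) / cell)
    n_rows = int((y_range[1] - y_range[0]) / cell)
    grid = [0] * (n_rows * n_cols)
    for x, y in points:
        if x_range[0] <= x < x_range[1] and y_range[0] <= y < y_range[1]:
            col = int((x - x_range[0]) / cell)
            row = int((y - y_range[0]) / cell)
            grid[row * n_cols + col] = 1
    return grid


def build_features(points, v_t, w_t, x_param, theta):
    """Feature vector: flattened grid followed by V_t, W_t, X and Theta."""
    return occupancy_grid(points) + [v_t, w_t, x_param, theta]


def yaw_rate_class(w, w_max, n_classes):
    """Index of the yaw rate range (class label) that contains w."""
    index = int((w + w_max) / (2.0 * w_max) * n_classes)
    return min(max(index, 0), n_classes - 1)


# One obstacle point inside the grid and one outside of it.
features = build_features([(5.5, 0.5), (25.0, 0.0)], 4.0, 0.0, -6.0, 0.1)
label = yaw_rate_class(0.0, 0.5, 5)
```

At run time, the predicted class would be mapped back to its yaw rate range, and only that range would be searched by the optimizer.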


4. VALIDATION RESULTS 

The final system is evaluated in the CARLA simulation environment on map 04. The simulation and the
implementation of the proposed method run together on a computer with an i7-6700HQ processor and an
Nvidia GeForce GTX 970M graphics card. The results are evaluated using time steps of 0.1 s and 0.5 s for
the low-level and high-level controllers, respectively, a minimum distance to collision of 10 m, and a
desired velocity of 4 m/s.

Section 4.1 presents the results of the learning step for the yaw rate prediction, where the best of
several trained models is selected and tested in simulation on the lane centering task. In Section 4.2,
obstacles are placed on the road, and the high-level controller must drive the vehicle along the lane
while performing obstacle avoidance. First, to make use of the reduced dynamic window of the IRDWA
controller, a machine learning model is trained and its results are shown. Then, different optimization
methods for IRDWA maximization are compared, each tested with and without the reduced dynamic window
activated.


4.1. Lane keeping 

The training dataset for yaw rate prediction, containing visual features and the current velocities,
was gathered by driving a vehicle along the lane center on the road of map 07 of the CARLA simulator.
Situations where the car is not in the lane center and must return to it were also included. Figure 11
shows the yaw rate histogram of the dataset, which has 6,804 samples.


Figure 11. Yaw rate histogram


Supervised machine learning models were trained on the dataset using different methods, and the
results are shown in Table 1. Mean squared error (MSE) and accuracy (Acc) are the scores used to select
the best model. Acc corresponds to the fraction of model responses with an error of less than 0.5 m/s.
The scores are evaluated on both training and test data. The dataset is preprocessed (normalization or
standardization), and the best hyperparameters were found by trying different settings and selecting
those that resulted in the highest test accuracy.
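
The two scores can be computed as in the sketch below. The function names are hypothetical; the default tolerance mirrors the 0.5 threshold stated above, and the sample values are invented for illustration only.

```python
def mean_squared_error(y_true, y_pred):
    """Average of the squared prediction errors."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def tolerance_accuracy(y_true, y_pred, tolerance=0.5):
    """Fraction of predictions whose absolute error is below the tolerance."""
    hits = sum(1 for t, p in zip(y_true, y_pred) if abs(t - p) < tolerance)
    return hits / len(y_true)


# Illustrative values only (not data from the paper).
targets = [0.0, 1.0, 2.0, 3.0]
predictions = [0.1, 1.6, 2.0, 2.6]
mse = mean_squared_error(targets, predictions)
acc = tolerance_accuracy(targets, predictions)
```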


Table 1. ML scores 


Method          Preproc.  Parameters                              Train MSE  Train Acc  Test MSE  Test Acc
SVR             Norm.     C=100, degree=0.5, kernel=rbf           0.011      0.878      0.011     0.880
Ridge           Norm.     alpha=1                                 0.021      0.776      0.021     0.779
K Neighbors     Stand.    leaf size=3, neighbors=8                0.008      0.900      0.011     0.892
Random Forest   Norm.     max. depth=13, estimators=300           0.002      0.965      0.010     0.884
Elastic Net     Stand.    alpha=0.1, l1 ratio=0.5                 0.262      0.744      0.272     0.751
Neural Network  Stand.    activation=tanh, batch size=64,         0.119      0.893      0.118     0.886
                          learning rate=0.01, solver=sgd,
                          hidden layer sizes=(500, 400, 200, 50)


Higher scores on test data indicate better generalization. The K-neighbors, random forest, and
neural network regression models presented good scores compared to the other methods, so these three
methods were then tested in a lane-keeping task. The results are shown in Table 2.


Table 2. Lane keeping results 


Method            Maximum offset (m)    Offset mean (m)
K Neighbors       1.275                 0.139
Random Forest     1.097                 0.235
Neural Network    0.960                 0.240

During each simulation, the vehicle drove along a 2800 m road. In all three tests, the vehicle was
able to keep the lane, with the maximum offset from the lane center shown in Table 2. The neural network
presented the lowest maximum offset and was therefore chosen for the high-level controller in the
obstacle avoidance tests.

Figure 12 shows the vehicle trajectory and its lane center offset in meters when using the neural
network model. The trajectory color represents the lane center offset, according to the scale of the
color bar. The points where the car is farthest from the lane center are colored yellow, and the points
where the car is centered in the lane are purple. The visual parameters X and Θ and the output of the
high-level controller (yaw rate), collected in the trajectory segment with the maximum offset, are shown
in Figure 13.


Figure 12. Trajectory


Figure 13. Maximum offset trajectory segment


4.2. Obstacle avoidance task 

The histograms of the dataset used to train the reduced dynamic window model are displayed in
Figure 14. The confusion matrix in Figure 15 illustrates the performance of the final model, where the
numbered vertical and horizontal axes represent the classes corresponding to their respective ranges of
yaw rate values. The numbers of correct predictions appear on the diagonal of the confusion matrix. The
F1 score achieved by this model is 0.996 on training data and 0.826 on test data, out of a maximum of 1.
This model is then used to perform obstacle avoidance tasks, where the controller is validated and
compared with and without the reduced dynamic window module activated.


Figure 14. Dataset histograms


Figure 15. Confusion matrix 
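
The way an F1 score follows from a multi-class confusion matrix can be sketched as below. Macro averaging (the unweighted mean of the per-class F1 scores) is an assumption, since the text does not say which averaging was used, and the matrices in the example are invented for illustration.

```python
def macro_f1(confusion):
    """Macro-averaged F1 from a square confusion matrix.

    confusion[i][j] is the number of samples of true class i predicted as j.
    """
    n = len(confusion)
    scores = []
    for k in range(n):
        true_pos = confusion[k][k]
        false_pos = sum(confusion[r][k] for r in range(n)) - true_pos
        false_neg = sum(confusion[k]) - true_pos
        denominator = 2 * true_pos + false_pos + false_neg
        scores.append(2 * true_pos / denominator if denominator else 0.0)
    return sum(scores) / n


# Illustrative matrices only (not the paper's results).
perfect = macro_f1([[5, 0], [0, 5]])
noisy = macro_f1([[4, 1], [1, 4]])
```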


Tables 3 and 4 show the results achieved by running the final system on a simulated car driving
along a road while avoiding a single obstacle and multiple obstacles, respectively. Different
optimization methods for IRDWA maximization are used, with different parameters: the velocity step
between iterations for exhaustive optimization, and the population size for the other methods. The Time
column gives the computer processing time in seconds spent by each test configuration (optimization
method, its parameters, and whether the search window reduction (SWR) module is activated), and the IRDWA
column gives the corresponding mean of the IRDWA values along the trajectory.


Table 3. Optimization results: single obstacle

Case  Time (s)  IRDWA   Method        Parameters                  SWR
1     02.835    12.220  Exhaustive    W step: 0.05, V step: 0.15  No
2     01.298    10.151  Exhaustive    W step: 0.05, V step: 0.15  Yes
3     04.426    10.404  Dif. Evol.    pop 15                      No
4     18.298    12.631  Dif. Evol.    pop 15                      Yes
5     01.858    10.760  Dif. Evol.    pop 03                      No
6     04.639    13.031  Dif. Evol.    pop 03                      Yes
7     01.177    09.936  Part. Swarm.  pop 25                      No
8     03.371    11.829  Part. Swarm.  pop 25                      Yes
9     00.521    10.100  Part. Swarm.  pop 05                      No
10    00.776    10.605  Part. Swarm.  pop 05                      Yes


Table 4. Optimization results: multiple obstacles

Case  Time (s)  IRDWA   Method        Parameters                  SWR
1     02.456    12.257  Exhaustive    W step: 0.05, V step: 0.15  No
2     01.207    10.087  Exhaustive    W step: 0.05, V step: 0.15  Yes
3     05.161    12.401  Dif. Evol.    pop 15                      No
4     15.955    12.659  Dif. Evol.    pop 15                      Yes
5     01.693    10.849  Dif. Evol.    pop 03                      No
6     05.181    13.019  Dif. Evol.    pop 03                      Yes
7     00.817    09.201  Part. Swarm.  pop 25                      No
8     01.380    11.325  Part. Swarm.  pop 25                      Yes
9     00.471    08.582  Part. Swarm.  pop 05                      No
10    00.830    11.152  Part. Swarm.  pop 05                      Yes


Comparing the performance of the optimization methods with and without search window reduction in
these tables shows that the search window reduction enables a faster optimization in some cases (Case 2
in Tables 3 and 4), and also an increase in the IRDWA mean value (Cases 7 and 10 in Table 3) when the
population size is reduced. In some cases, higher IRDWA mean values are achieved with small changes in
time (Cases 7 and 10 in Table 4) when the optimizer parameters are made simpler.

Figure 16 presents the trajectory followed by the vehicle for some of the cases in the tables, where
the numbers represent the current frame or timestamp. Figure 16(a) is the resulting trajectory of Case 1
from Table 3, Figure 16(b) of Case 2 from Table 3, Figure 16(c) of Case 7 from Table 4, and Figure 16(d)
of Case 10 from Table 4. The vehicle trajectory and the positions of the obstacles were generated from
the coordinate points collected during the simulations. The velocities along the trajectory are shown in
Figure 17, where Figure 17(a) corresponds to Case 1 from Table 3, Figure 17(b) to Case 2 from Table 3,
Figure 17(c) to Case 7 from Table 4, and Figure 17(d) to Case 10 from Table 4. Samples of captured image
frames are shown in Figure 18 and Figure 19, where the timestamps are ordered from top left to bottom
right.


Figure 16. Resulting trajectories from tests in (a) Case 1 from Table 3, (b) Case 2 from Table 3, (c) Case 7
from Table 4, and (d) Case 10 from Table 4


Figure 17. Resulting vehicle velocities from tests in (a) Case 1 from Table 3, (b) Case 2 from Table 3,
(c) Case 7 from Table 4, and (d) Case 10 from Table 4


Figure 18. Images of Case 7 from Table 4


Figure 19. Images of Case 10 from Table 4


5. CONCLUSION 

This work proposed a combination of machine learning and a model-based controller. The system
successfully completed the validation tests, and the results showed its learning capability for both the
lane centering task and the velocity search window reduction. The proposed system also avoided
collisions with obstacles through its reactive functionality. More work must be done to make this system
more robust, for example by considering vehicle dynamics in the IRDWA optimization step, which would
enable the vehicle to drive at higher velocities. In addition, more data and features can be added to the
training datasets of both machine learning models used in the system. For the yaw rate prediction, more
vehicle dynamics information can be extracted, and for the reduced dynamic window model, more complex
scenarios can be covered in the dataset.